Abstract

We consider the enciphering of a data stream
while it is being compressed by an LZ
algorithm. This is to be compared with the
classical encryption-after-compression
methods used in security protocols. Most
cryptanalysis techniques exploit patterns
found in the plaintext to crack the cipher;
compression reduces these patterns and hence
the scope for such attacks. Our scheme is
based on an LZ compression into which a
Vernam cipher has been added. We make some
security remarks by measuring the randomness
of its output with statistical tests. Such a
scheme could be employed to increase the
speed of security protocols and to decrease
the computing power required on mobile
devices.

Cryptography, compression, pseudo-random 
sequences, security. 

Introduction 

Information security is currently one of the
main challenges in computer networks. In the
emergent communication paradigm where
wireless and wired networks interoperate,
security issues become crucial. Traditional
technologies are increasingly inadequate,
and existing standards should be improved
for use in resource-restricted environments.
We aim to develop an algorithm that provides
confidentiality at a lower cost in terms of
size and computing power.

In many security protocols, a compression
algorithm is run prior to encrypting the
data, to increase both the security and the
bandwidth. These algorithms are run on the
original stream. They all stem from research
by J. Ziv and A. Lempel, who designed two
compression algorithms: LZ77 and LZ78 [5].
After compression, if the speed of
computation is taken into account, the
compressed data is enciphered with a stream
cipher like RC4 (let us recall that RC4 is
15 times quicker than 3DES and is used in
protocols like WEP and SSL [4]).
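As an illustration only, the classical
compress-then-encrypt pipeline described
above can be sketched as follows. The sketch
assumes Python's zlib module (DEFLATE, that
is LZ77 followed by Huffman coding) as a
stand-in for the compressor and a textbook
RC4 as the stream cipher. The function names
are hypothetical and do not come from any
protocol specification.

```python
import zlib


def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes for `key` (textbook KSA then PRGA)."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key-scheduling algorithm
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    while True:                               # pseudo-random generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]


def rc4(key: bytes, data: bytes) -> bytes:
    """XOR `data` with the keystream; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))


def compress_then_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Two separate passes: compress first, then encipher the result."""
    return rc4(key, zlib.compress(plaintext))


def decrypt_then_decompress(key: bytes, ciphertext: bytes) -> bytes:
    """Inverse of compress_then_encrypt."""
    return zlib.decompress(rc4(key, ciphertext))
```

The point of the sketch is that the data is
traversed twice, once by the compressor and
once by the cipher; this second pass is what
a combined scheme tries to avoid.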

In the present paper, we propose to scramble
(encipher) a data stream while it is being
compressed. We assume the reader is familiar
with classical compression algorithms and
with secret-key cryptography, for which [6]
is a good introduction. The paper is
organized as follows: section 1 presents the
basis of our idea, while section 2 recalls
the related results. In section 3 we
illustrate our idea with a "toy"
implementation which uses a compression
algorithm from the Lempel-Ziv family. Some
statistical tests have been made and are
presented in section 4.



1 The idea



Our idea is to encipher the data stream
while it is being compressed by a lossless
dictionary algorithm. The motivation is that
a compressed stream is already almost random
and hence a good candidate to be scrambled
by a simple Vernam cipher. This comes from
the notion of incompressibility introduced
with Kolmogorov complexity. A.N. Kolmogorov
[2] proposed a complexity which speaks about
individual objects rather than the usual
classes of languages addressed by classical
complexity. Informally, the Kolmogorov
complexity of an object x corresponds to the
size of the smallest program p which can
print x on its standard output. If
|p| < |x|, we say that x is compressible,
otherwise incompressible. This provides a
modern notion of randomness, dealing with
the quantity of information in individual
objects: an object x is random if it cannot
be represented by a shorter program p whose
output is x or, in other words, if x is
incompressible [5]. From this point of view,
the output of any compression algorithm is
an approximation of a random sequence,
although highly reversible. Our idea is to
scramble the output of a compression
algorithm by a Vernam cipher, and to do this
while the data stream is being compressed,
so as to avoid passing the compressed data
stream to a separate encryption process.
From the above discussion, the output of our
scheme should be almost random.



2 Related work


There are two methods which share the same
idea, in slightly different ways. The first
one is called concryption and has been
patented by Security Dynamics (US Patent
#5479512). It is a method for the integrated
compression and encryption (concryption) of
clear data. For concryption, the clear data
and an encryption key are obtained, at least
one compression step is performed and at
least one encryption step is performed
utilizing the encryption key. The encryption
step is preferably performed on the final or
intermediate results of a compression step,
with compression being a multistep
operation. The second method is called
compryption [?] and is due to R. E.
Crandall, at the time Apple's Chief
Cryptographer. Roughly, his idea is to index
a great number of entropy compression
algorithms by a secret key. He thus obtains
a holistic (one-pass) compress/encrypt
algorithm. This method is used for
enciphering the passwords in the Keychain
application, starting with Mac OS 9 and
still in Mac OS X from Apple. It is recorded
under US Patent #6154542, "Method and
apparatus for simultaneously encrypting and
compressing data".



3 The proposed scheme 

We use the mode of operation of LZ78 which
uses a growing dictionary [5]. It starts
with 2^9 = 512 entries (with the first 256
entries already filled, possibly after an
initial permutation). While this dictionary
is in use, 9-bit pointers are written onto
the output stream after encryption by a
Vernam cipher. When the original dictionary
is full, its size is doubled to 1024 entries
and 10-bit pointers are then used (and
encrypted as well), and so on until the
pointer size reaches a maximum value set by
the user. When the largest dictionary is
full, the program continues without changes
to the dictionary but monitors the
compression ratio. If this ratio falls below
a predefined threshold, the dictionary is
deleted and a new 512-entry dictionary is
started. The algorithm below presents the
scheme. In the sequel, PRBS denotes a
pseudo-random Boolean sequence.

Index = 256; Length = 9; Word = null;
Limit = 12;

Initialise 256 inputs in Dictionary
//(possibly after a permutation)
//(a+b stands for concatenation)
REPEAT
    read S
    //(Read a symbol from the stream)
    IF Word+S is in Dictionary
    THEN
        Word = Word+S;
        Emit = false
    ELSE
        Output (index of Word) XOR (PRBS)
        // Vernam cipher
        Index of (Word+S) = Index;
        Index++
        IF Length = Limit
        THEN
            Re-initialise Dictionary
        ENDIF
        IF Index = 2^Length
        THEN Length++
        ENDIF
        Word = S; Emit = true
    ENDIF
UNTIL no data found
IF Emit = false THEN
    Output the (index of Word) XOR (PRBS)
    // Vernam cipher
ENDIF

The implementation was made only as a proof
of concept, in C++ and using the LEDA
library [?], which provides a sizable
collection of data types and algorithms in a
form which allows them to be used by
non-experts.
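The pseudocode above can be made concrete
with a small Python sketch. This is not the
C++/LEDA implementation; it is a minimal
illustration under several assumptions of
our own: the PRBS is a toy 16-bit LFSR (not
secure), the dictionary is simply restarted
when it is full at the maximum pointer width
(the compression-ratio monitoring is
omitted), the pending word is always flushed
at the end of the stream, and a matching
decoder is added only to check that the
output remains reversible. All names are
hypothetical.

```python
MIN_WIDTH, MAX_WIDTH = 9, 12   # pointer sizes in bits (Length and Limit)


def lfsr_bits(seed: int):
    """Toy 16-bit Fibonacci LFSR used as PRBS (assumption; NOT secure)."""
    state = seed & 0xFFFF
    assert state != 0, "an all-zero LFSR state never changes"
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield bit


def take(bits, n: int) -> int:
    """Collect the next n keystream bits into an integer."""
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bits)
    return value


def encode(data: bytes, seed: int) -> bytes:
    """Compress `data`, XORing each pointer with the PRBS before output."""
    prbs = lfsr_bits(seed)
    table = {bytes([i]): i for i in range(256)}
    next_index, width = 256, MIN_WIDTH
    acc, nbits, out = 0, 0, bytearray()

    def emit(code: int) -> None:
        nonlocal acc, nbits
        acc = (acc << width) | (code ^ take(prbs, width))   # Vernam cipher
        nbits += width
        while nbits >= 8:                 # flush complete bytes, MSB first
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1

    word = b""
    for byte in data:
        s = bytes([byte])
        if word + s in table:
            word += s
            continue
        emit(table[word])
        table[word + s] = next_index
        next_index += 1
        if next_index == 1 << MAX_WIDTH:  # dictionary full: start over
            table = {bytes([i]): i for i in range(256)}
            next_index, width = 256, MIN_WIDTH
        elif next_index == 1 << width:    # pointers need one more bit
            width += 1
        word = s
    if word:
        emit(table[word])                 # flush the pending word
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)   # zero-pad the last byte
    return bytes(out)


def decode(blob: bytes, seed: int) -> bytes:
    """Inverse of encode; replays the same PRBS and pointer widths."""
    prbs = lfsr_bits(seed)
    table = [bytes([i]) for i in range(256)]
    next_index, width = 256, MIN_WIDTH    # mirrors the encoder's counters
    stream, total, pos = int.from_bytes(blob, "big"), len(blob) * 8, 0
    prev, out = None, bytearray()
    while total - pos >= width:
        pos += width
        code = (stream >> (total - pos)) & ((1 << width) - 1)
        code ^= take(prbs, width)
        if prev is not None:
            # Complete the entry the encoder created after the previous
            # pointer; the pointer may refer to that very entry.
            first = table[code][:1] if code < len(table) else prev[:1]
            table.append(prev + first)
        prev = table[code]
        out += prev
        next_index += 1
        if next_index == 1 << MAX_WIDTH:
            table = table[:256]
            next_index, width, prev = 256, MIN_WIDTH, None
        elif next_index == 1 << width:
            width += 1
    return bytes(out)
```

For example, decode(encode(b"abab", 0xACE1),
0xACE1) gives back b"abab". The decoder
stays synchronised because the pointer width
and the number of PRBS bits consumed depend
only on how many pointers have been
processed, which both sides can count. The
big-integer bit reader in decode is chosen
for brevity, not for speed.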

The difference with concryption is that we
use a single-pass compression algorithm,
whereas concryption requires the compression
to be a multistep operation. Moreover, our
scheme is not based on an entropy
compression algorithm as used in the
compryption method, although an entropy
compression algorithm could be added to
shorten the most frequently used pointers
returned by the algorithm.

4 Analysis 

Though a plot of the output (see figure 1)
is rather encouraging, we were disappointed
when testing outputs with a χ² test, for
which the results were a little too weak.
This may be explained by the rather bad
choice of a "toy" linear feedback shift
register for generating the PRBS. We expect
the result to be improved with the use of
good pseudo-random generators like RC4, or
even with so-called "perfect" PRBS
generators.

Figure 1: Probabilities of the output.

Further testing should be made according to
[?], which requires a PRG to pass a number
of statistical tests, or with the Marsaglia
tests, a set of 23 very strong tests of
randomness implemented in the Diehard
program.



5 Discussion

Although not truly pseudo-random (but this
is also not a pseudo-random generator), the
output of our compression and encryption
scheme is encouraging if we look at the
typical output depicted in figure 1. Further
study should be made with the help of a good
pseudo-random generator, with classical
tests and a fine tuning of all the
parameters.

The use of compression and encryption mixed
together should increase the bandwidth and
decrease the latency; it might also decrease
the energy consumption, compared with
encryption after compression, on mobile
devices or RFID.
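As a pointer for such classical tests, the
χ² test mentioned in section 4 can be
sketched as follows. We assume here that the
test is applied to the frequencies of the
256 byte values of the output against a
uniform distribution; the text does not say
which symbols were actually tested, and the
function name is hypothetical.

```python
from collections import Counter


def chi_square_bytes(data: bytes) -> float:
    """Chi-square statistic of byte frequencies against a uniform law."""
    n = len(data)
    if n == 0:
        raise ValueError("need at least one byte")
    expected = n / 256                    # expected count per byte value
    counts = Counter(data)
    return sum((counts.get(v, 0) - expected) ** 2 / expected
               for v in range(256))
```

With 255 degrees of freedom, the statistic
of a truly random byte stream has mean 255
and standard deviation sqrt(510), about
22.6. Values far above this range indicate a
non-uniform output, while values far below
it indicate an output that is suspiciously
regular.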